Method and apparatus for extending a file size beyond a file size limitation

ABSTRACT

A method and apparatus for extending the size of a file beyond a file size limitation of a computer. In one embodiment, a storage area is randomly selected from among a plurality of available storage areas. A determination is made as to whether the selected storage area contains at least a predetermined amount of free space. If so, the predetermined amount of free space is allocated on the selected storage area to create an allocated storage area. A symbolic link to the allocated storage area is written in a directory associated with a virtual storage volume. Data destined for the virtual storage volume is then written in the allocated storage area.

FIELD OF THE INVENTION

[0001] The present invention relates generally to file systems for computers, and more particularly, to a method and apparatus for extending the size of a file beyond a file size limitation imposed by a computer file system.

BACKGROUND OF THE INVENTION

[0002] An operating system is a program that is used to manage other programs (i.e., application programs) in a computer system. In a typical computer system, the operating system is initially loaded into the computer by a boot program. Once loaded, the operating system can perform a number of services for the application programs. Such services include determining the order in which certain applications can run, managing the sharing of resources (e.g., memory) between the applications, and managing input and output to and from hardware devices, such as disk drives. The application programs make use of the operating system by requesting services, for example, through the use of an application programming interface (API). A user of the computer system can also interact directly with the operating system, for example, through the use of a graphical user interface (GUI).

[0003] With respect to managing input and output between the application programs and one or more hardware devices, each type of operating system is typically closely related to and may be designed to work with a specific file system that manages the data on the disk drives. Some examples of operating systems include Unix, Linux (a variant of Unix) and Windows.

[0004] A file system typically specifies a convention for naming files, including, for example, the maximum number of characters in a file name, the type of characters that can be used, the format of file extensions that are permitted, etc. The file system also specifies the algorithmic or logical locations where a file can be placed. Windows and Unix-based operating systems typically employ file systems that use a hierarchical or tree-like structure wherein a file is placed in a directory or subdirectory located at a particular position in the hierarchical structure.

[0005] Depending on the addressing structure used, a file system can possess two different but related constraints: a limitation on the maximum size of an individual file, and a limitation on the maximum size of the file system itself.

[0006] The size of an individual file may be physically limited by the number of bits used in describing an address space of the file. For example, some versions of Linux, which were designed for use with a hardware architecture of 32 bits, use a four byte integer to address the contents of a file. Thus, the maximum size of a file is limited to 2³¹ bytes minus some number of bytes, i.e., about 2 gigabytes.
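The arithmetic behind the 2 gigabyte figure can be checked with a short Python sketch. It assumes that the four byte integer is signed, which is what yields 2³¹ rather than 2³²; the function name `max_offset` is hypothetical and is used only for illustration.

```python
def max_offset(offset_bytes: int, signed: bool = True) -> int:
    """Largest file offset representable in an integer of the given width.

    Assumption: one bit is lost to the sign when the integer is signed.
    """
    bits = offset_bytes * 8 - (1 if signed else 0)
    return 2**bits - 1


limit = max_offset(4)
print(limit)          # 2147483647
print(limit / 2**30)  # just under 2 (gigabytes, counted as 2**30 bytes)
```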

[0007] The size of the file system itself is also limited. While the address space of a file system is typically represented by all or part of an eight byte integer, the maximum size of the file system is usually set to a predetermined limit. Limiting the maximum size of the file system provides a number of advantages. For example, the computer system may require less memory and may be able to locate files faster than if the file system were larger. On older Linux systems, for example, the maximum size of the file system was set at one terabyte. That is, regardless of the size of a physical disk, the disk must be divided into a plurality of partitions, each of which is less than or equal to the maximum allowable size of the file system.

[0008] In many applications, particularly seismic applications, a maximum file size of, for example, 2 gigabytes can be very restrictive. Seismic work typically involves the processing of large volumes of seismic data. These volumes of data often span hundreds of magnetic tapes, and can be several hundred gigabytes in size. Typically, as seismic data tapes are entered into a system, the contents of several tapes are copied to multiple files in one directory or data storage area. Then, when that data storage area becomes full, subsequent tapes are copied to many other data storage areas, which can be scattered around the computer system. Because the maximum size of the file system may be limited, for example, to 1 terabyte, one data storage area may not have sufficient free space to hold all of the incoming data. Thus, the user is required to manage the data by recording the locations (in the various data storage areas) of each part of the data. Accordingly, a need exists for a method for extending the size of a file beyond a file size limitation imposed by the file system used by a computer system.

SUMMARY OF THE INVENTION

[0009] One aspect of the present invention is directed to a method and apparatus for creating a virtual storage volume having a size that is independent of a file size limitation of a computer system. A storage area is randomly selected from among a plurality of available storage areas. A determination is made as to whether the selected storage area contains at least a predetermined amount of free space. If so, free space corresponding to the predetermined amount is allocated on the selected storage area. A symbolic link to the allocated storage area is written in a directory associated with the virtual storage volume. Data destined for the virtual storage volume can then be written in the allocated storage area. Once the allocated storage area has been exhausted, the above steps can be repeated to create another allocated storage area, with a symbolic link thereto being written in the directory associated with the virtual storage volume. Thus, the size of the virtual storage volume can exceed the file size limitation of the computer system.

[0010] Another aspect of the present invention is directed to a method and apparatus for performing a file operation with respect to a virtual storage volume having a file size that is independent of a file size limitation of a computer system. A symbolic link, which is located in a directory associated with the virtual storage volume, is read. The symbolic link points to an allocated storage area, which had been randomly selected from among a plurality of available storage areas. The file operation is then performed with respect to the allocated storage area.

BRIEF DESCRIPTION OF THE DRAWINGS

[0011] A more complete understanding of the present invention and its advantages will be readily apparent from the following Detailed Description of the Preferred Embodiments taken in conjunction with the accompanying drawings, in which like parts are designated by like reference numbers and in which:

[0012] FIG. 1 is a schematic illustration of a computer network in accordance with the present invention;

[0013] FIG. 2 is a flow diagram illustrating a method for creating a virtual storage volume in accordance with the present invention;

[0014] FIG. 3 is a block diagram illustrating a user's directory and showing symbolic links from the user's directory to a plurality of file systems in accordance with the present invention; and

[0015] FIG. 4 is a flow diagram illustrating a method for performing a file operation with respect to a virtual storage volume in accordance with the present invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

[0016] FIG. 1 schematically illustrates a hardware environment of an embodiment of the present invention. A first communications network 110 provides an electronic communications medium that connects a user at a client computer 120 to first and second server computers 130, 140. The first and second server computers 130, 140 respectively run first and second operating systems, which have respective file systems associated therewith for interfacing to first and second storage arrays 135, 145. The first and second operating systems can be different operating systems or can be the same operating system.

[0017] The client computer 120 is also connected to a third server computer 150 via a second communications network 160. The third server computer 150 runs a third operating system having an associated file system for interfacing to a third storage array 155. Each of the first, second, and third storage arrays 135, 145, 155 can include a plurality of storage devices or can be a single storage device. Storage device as used herein can encompass hard disk drives, tape drives, solid state memory devices, or other types of storage devices. The third operating system can be different from the first and second operating systems or can be the same operating system.

[0018] The client computer 120 can be a personal computer or a workstation. The server computers 130, 140, 150 can be personal computers, workstations, minicomputers, or mainframes. The client computer 120 and the server computers 130, 140, 150 can be bi-directionally coupled to the first and second communications networks 110, 160 over communications lines, via wireless systems, or any combination thereof. For example, client computer 120 and the server computers 130, 140, 150 can be coupled to one another by various private networks, public networks or any combination thereof, including local-area networks (LANs), wide-area networks (WANs), or the Internet. Those skilled in the art will recognize many modifications that can be made to this configuration without departing from the scope of the present invention.

[0019] A virtual storage volume can be created in accordance with the present invention, for example, by a user at the client computer 120 creating a file in the user's home directory, or loading data from a tape or other storage device to a file in the user's home directory. Conventionally, files located in the user's home directory are subject to a number of constraints: the maximum file size, the maximum file-system size, and the size of the user's local hard disk drive. The present invention advantageously allows the user to disregard these conventional constraints and to treat the user's home directory as a substantially unlimited data storage area. Thus, the user can process and load data by manipulating what appears to the user to be a single file within the user's home directory structure.

[0020] FIG. 2 is a flow diagram illustrating the operation of a computer in creating a virtual storage volume in accordance with the present invention. The virtual storage volume can be created in a directory associated with a particular user, and can be created by the computer as a result of the user creating a file or writing to a file in the user's home directory. The method of creating the virtual storage volume can be implemented in a computer program, or alternatively, can be implemented as a subroutine library, which is called by a computer program and integrated into the computer program.

[0021] The process starts at step 200. In step 210, the computer receives a list of available storage areas which can be located in various file systems. The list can indicate only those storage areas that a particular user is permitted to access, or can indicate at least a portion of the storage areas that are connected to and accessible by the computer via one or more networks, such as the first and second communications networks 110, 160. In step 215, the computer randomly selects a storage area from among the available storage areas listed. The computer can also randomly select a storage area without employing the list of available storage areas. In this case, the computer can select a storage area from among the storage areas to which it is connected or has access.

[0022] Then, in step 220, the computer determines whether the selected storage area contains at least a minimum amount of free (e.g., unused or unallocated) space. The amount of free space required can be determined by a system administrator based on the total amount of storage space available, or can be based on other considerations. For example, the use of about a 250 megabyte block of space can result in more efficient utilization of available storage space, while the use of about a 2 gigabyte block can result in more efficient data transfer. The system administrator can set the required amount of free space via a parameter (e.g., a block size parameter) associated with the virtual storage volume. The computer can then look for an amount of free space in the selected storage area based on information representing the block size, which is stored in the parameter. The system administrator can also adjust the size of the block (e.g., based on currently available resources, such as the amount of available hard drive space) by editing the data stored in the block size parameter.
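The effect of the block size parameter on the number of allocated storage areas can be illustrated with a small Python calculation. This is only a sketch: the 300 gigabyte data set is a hypothetical figure chosen for illustration, sizes are counted in powers of two, and the name `blocks_needed` does not appear in the specification.

```python
MIB = 2**20
GIB = 2**30


def blocks_needed(total_bytes: int, block_size: int) -> int:
    """Number of blocks of block_size bytes required to hold total_bytes."""
    return -(-total_bytes // block_size)  # ceiling division on integers


data_size = 300 * GIB  # hypothetical seismic data set
print(blocks_needed(data_size, 250 * MIB))  # 1229 smaller blocks
print(blocks_needed(data_size, 2 * GIB))    # 150 larger blocks
```

A smaller block wastes less space in the last, partly filled block, while a larger block means fewer symbolic links to follow for the same amount of data.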

[0023] If the selected storage area has at least the required amount of free space available (Yes in step 220), the process proceeds to step 225. If the selected storage area does not have at least the predetermined amount available (No in step 220), the process returns to step 215, wherein the computer randomly selects another storage area.
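Steps 215 and 220 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: each storage area is assumed to be a directory path, free space is read with `shutil.disk_usage`, and, unlike the flow described above, a rejected area is removed from the candidates so that the loop terminates when no area qualifies. The names `pick_storage_area`, `free_space` and `rng` are hypothetical.

```python
import random
import shutil


def pick_storage_area(areas, block_size,
                      free_space=lambda path: shutil.disk_usage(path).free,
                      rng=random):
    """Randomly pick an area with at least block_size bytes free."""
    candidates = list(areas)
    while candidates:
        area = rng.choice(candidates)        # step 215: random selection
        if free_space(area) >= block_size:   # step 220: free-space test
            return area
        candidates.remove(area)              # assumption: do not retry a full area
    raise OSError("no storage area has the required amount of free space")
```

The `free_space` argument is injectable so that the selection logic can be exercised without real disks, for example with `pick_storage_area(table, 100, free_space=table.get)` where `table` maps hypothetical area names to free byte counts.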

[0024] By randomly selecting a storage area, rather than selecting the storage area having the greatest amount of free space, the process of the present invention advantageously distributes input/output activity among the various server computers 130, 140, 150 and associated storage arrays 135, 145, 155. The random selection of a storage area provides the advantage that, even when multiple users are creating their respective virtual storage volumes, the allocated storage areas are likely to be in different file systems. Thus, multiple users are not likely to be accessing the same file systems at the same time, a situation that can cause an input/output bottleneck.

[0025] Then, in step 225, the computer allocates an amount of free space in the selected storage area that is equal to the required amount. The amount of free space allocated can be determined, for example, by the block size parameter. In step 230, the computer writes a symbolic link to the allocated storage area in a directory associated with the virtual storage volume. The symbolic link points to and enables access to the allocated storage area.
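Steps 225 and 230 might look like the following Python sketch on a system that supports symbolic links. Several simplifications are assumed: the allocated storage area is modeled as an empty file created inside the selected area, no space is actually reserved for it, and the link carries the same name as the file it points to. The function name `allocate_and_link` is hypothetical.

```python
import os


def allocate_and_link(area_dir, volume_dir, link_name):
    """Create a backing file in area_dir and link to it from volume_dir."""
    target = os.path.abspath(os.path.join(area_dir, link_name))
    # Step 225 (simplified): create the backing file; "x" fails if it exists.
    # Assumption: reserving block_size bytes is left out of this sketch.
    with open(target, "xb"):
        pass
    # Step 230: write the symbolic link in the volume's directory.
    link = os.path.join(volume_dir, link_name)
    os.symlink(target, link)
    return link
```

Data written through the returned link lands in the backing file, so a program working in the volume's directory does not need to know which storage area was chosen.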

[0026] In step 235, the computer writes the data to be stored in the virtual storage volume in the allocated storage area. As data is written into the allocated storage area, the amount of space remaining will diminish. As long as there is space available in the allocated storage area, additional data can be placed there. However, once there is no more space remaining in the allocated storage area (No in step 240), the process returns to step 215 wherein another storage area is randomly selected. Alternatively, the process can return to step 210 to receive a list of available storage areas, which list can be changed, for example, due to network conditions, since the last time that a storage area was selected.

[0027] The process thus allows a large number of allocated storage areas to be linked to a single virtual storage volume, such that the total amount of space available for a single file (i.e., the aggregate of the allocated storage areas) is greater than the maximum allowable file size as determined by the operating system. Further, as the allocated storage areas can be on different file systems, the total amount of space available can be greater than the maximum size of any particular file system.
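The overall loop of FIG. 2 can be sketched end to end in Python. The sketch below makes several assumptions that are not part of the specification: the data is held in memory as a bytes object, the free-space test of step 220 is omitted, each storage area is a directory, and the links are named with a ".DAT" suffix for the first piece and three-digit suffixes thereafter. The names `segment_name` and `write_volume` are hypothetical.

```python
import os
import random


def segment_name(name, index):
    """Link name for the index-th piece (assumed naming scheme)."""
    return f"{name}.DAT" if index == 0 else f"{name}.{index:03d}"


def write_volume(data, areas, volume_dir, name, block_size, rng=random):
    """Split data into block_size pieces stored in randomly chosen areas."""
    links = []
    for index, start in enumerate(range(0, len(data), block_size)):
        area = rng.choice(areas)  # step 215 (free-space test omitted)
        target = os.path.abspath(os.path.join(area, segment_name(name, index)))
        with open(target, "wb") as f:  # steps 225 and 235
            f.write(data[start:start + block_size])
        link = os.path.join(volume_dir, segment_name(name, index))
        os.symlink(target, link)  # step 230
        links.append(link)
    return links
```

Because each piece is at most `block_size` bytes, no single backing file needs to exceed the file size limitation, while the pieces taken together can be arbitrarily large.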

[0028] FIG. 3 is a block diagram illustrating the relationship between a virtual storage volume, which is located in a user's directory 300, and the actual data, which are distributed between various ones of the first, second, and third storage arrays 135, 145, 155. The user's directory 300 contains a virtual storage volume named “UserFile”, which consists of a plurality of files, respectively named “UserFile.DAT”, “UserFile.001”, “UserFile.002”, . . . “UserFile.xxx”. Each of the files “UserFile.DAT”, “UserFile.001”, “UserFile.002”, . . . “UserFile.xxx” contains a symbolic link that points to a storage area on one of the first, second, or third storage arrays 135, 145, 155.

[0029] More specifically, in this embodiment, the file “UserFile.DAT” contains a symbolic link 325 that points to a disk DISK6 in the first storage array 135. Likewise, the files “UserFile.001”, “UserFile.002”, and “UserFile.xxx” respectively contain symbolic links 330, 335, 340 that respectively point to DISK1 in the third storage array 155, DISK2 in the first storage array 135, and DISK3 in the second storage array 145. As the allocated storage areas are randomly selected from among the available storage areas, the files associated with the virtual storage volume are randomly interspersed among the first, second, and third storage arrays 135, 145, 155.

[0030] In the embodiment of FIG. 3, the allocated storage areas and the directory containing the virtual storage volume are located in different file systems. In other embodiments, depending on the size of the user directory, at least a portion of the allocated storage areas can be located in the same file system as the directory containing the virtual storage volume.

[0031] While the embodiment of FIG. 3 allows the user to see each of the plurality of files that make up the virtual storage volume, in other embodiments, the constituent files of the virtual storage volume can be hidden from the user. In such embodiments, the user sees only one file in the user directory for each virtual storage volume.

[0032] FIG. 4 is a flow diagram illustrating the operation of a computer in performing a file operation with respect to a virtual storage volume in accordance with the present invention. The virtual storage volume can be located in a directory associated with a particular user, and can appear to the user to be a file in the user's home directory. The method of performing a file operation with the virtual storage volume can be implemented in a computer program, or alternatively, can be implemented as a subroutine library, which is called by a computer program and integrated into the computer program.

[0033] The process starts at step 400. In step 410, the computer receives an instruction to perform a file operation with respect to the virtual storage volume. For example, the instruction can be to read or load into memory a virtual storage volume named “UserFile”.

[0034] In step 420, the computer locates a symbolic link associated with the virtual storage volume. In the embodiment of FIG. 3, the virtual storage volume “UserFile” includes a file named “UserFile.DAT” containing a symbolic link 325. Then, in step 430, the computer reads the symbolic link (e.g., symbolic link 325 of the file “UserFile.DAT”), which points to an allocated storage area (e.g., DISK6 of the first storage array 135). The file containing the symbolic link is preferably located in a directory associated with the virtual storage volume, such as the user's directory.

[0035] Then, in step 440, the computer locates the allocated storage area pointed to by the symbolic link. The computer then performs the requested file operation (step 450) with respect to the allocated storage area pointed to by the symbolic link (e.g., DISK6 of the first storage array 135). Once the file operation has been performed with respect to a first allocated storage area of the virtual storage volume, the process returns to step 430 to read a second file associated with the virtual storage volume (e.g., “UserFile.001”), locate the next allocated storage area, and perform the file operation with respect to that allocated storage area. Thus, the process repeats steps 430 through 450 until the file operation has been performed with respect to all the files of the virtual storage volume (No in step 460). The process then ends at step 470.
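For a read operation, the loop of FIG. 4 can be sketched in Python as follows. The sketch assumes that the constituent links sit in one directory and are named “UserFile.DAT”, “UserFile.001”, “UserFile.002” and so on with three-digit suffixes, and that the whole volume fits in memory; the function names `volume_links` and `read_volume` are hypothetical.

```python
import os


def volume_links(volume_dir, name):
    """Step 420: collect the volume's symbolic links in order."""
    links = []
    link = os.path.join(volume_dir, f"{name}.DAT")
    index = 1
    while os.path.islink(link):
        links.append(link)
        link = os.path.join(volume_dir, f"{name}.{index:03d}")
        index += 1
    return links


def read_volume(volume_dir, name):
    """Read every allocated storage area of the volume, in link order."""
    parts = []
    for link in volume_links(volume_dir, name):
        target = os.path.realpath(link)  # steps 430 and 440: resolve the link
        with open(target, "rb") as f:    # step 450: perform the operation
            parts.append(f.read())
    return b"".join(parts)
```

The loop ends when the next expected link does not exist, which plays the role of the "No" branch of step 460 in this sketch.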

[0036] Although the present invention has been fully described by way of examples and with reference to the accompanying drawings, it is to be understood that various changes and modifications will be apparent to those skilled in the art without departing from the spirit and scope of the invention. Therefore, unless such changes and modifications depart from the scope of the present invention, they should be construed as being included therein.

What is claimed is:
 1. A method, using a computer system, for creating a virtual storage volume having a size independent of a file size limitation of said computer system, the method comprising the steps of: randomly selecting a storage area from among a plurality of available storage areas; determining whether said selected storage area contains at least a predetermined amount of free space; allocating said predetermined amount of said free space on said selected storage area to create an allocated storage area; writing a symbolic link to said allocated storage area in a directory associated with said virtual storage volume; and writing data destined for said virtual storage volume in said allocated storage area.
 2. A method in accordance with claim 1, wherein said size of said virtual storage volume is independent of a file-system size limitation of said computer system.
 3. A method in accordance with claim 1, wherein said allocated storage area and said directory are located on different file systems.
 4. A method in accordance with claim 1, wherein said step of randomly selecting a storage area comprises randomly selecting said storage area from a list of said plurality of available storage areas.
 5. A method in accordance with claim 1, wherein said plurality of available storage areas are connected to said computer system via a communications network.
 6. A method in accordance with claim 1, wherein said predetermined amount is determined by a parameter associated with said virtual storage volume.
 7. A method in accordance with claim 1, wherein data to be written to said allocated storage area exceeds an amount of free space remaining in said allocated storage area, said method further comprising the steps of: randomly selecting a second storage area from among said plurality of available storage areas; determining whether said second selected storage area contains at least said predetermined amount of free space; allocating said predetermined amount of said free space on said second selected storage area to create a second allocated storage area; writing a second symbolic link to said second allocated storage area in said directory; and writing at least a portion of said data to said second allocated storage area.
 8. A method in accordance with claim 7, wherein said allocated storage area, said second allocated storage area, and said directory are located on different file systems.
 9. A computer readable medium, having computer executable instructions stored therein, for creating a virtual storage volume having a size independent of a file size limitation of a computer, said computer executable instructions comprising: instructions for randomly selecting a storage area from among a plurality of available storage areas; instructions for determining whether said selected storage area contains at least a predetermined amount of free space; instructions for allocating said predetermined amount of said free space on said selected storage area to create an allocated storage area; instructions for writing a symbolic link to said allocated storage area in a directory associated with said virtual storage volume; and instructions for writing data destined for said virtual storage volume in said allocated storage area.
 10. A computer readable medium in accordance with claim 9, wherein said size of said virtual storage volume is independent of a file-system size limitation of said computer system.
 11. A computer readable medium in accordance with claim 9, wherein said allocated storage area and said directory are located on different file systems.
 12. A computer readable medium in accordance with claim 9, wherein said instructions for randomly selecting comprises instructions for randomly selecting said storage area from a list of said plurality of available storage areas.
 13. A computer readable medium in accordance with claim 9, wherein said plurality of available storage areas are connected to said computer via a communications network.
 14. A computer readable medium in accordance with claim 9, wherein said predetermined amount is determined by a parameter associated with said virtual storage volume.
 15. A computer readable medium in accordance with claim 9, wherein data to be written to said allocated storage area exceeds an amount of free space remaining in said allocated storage area, said computer executable instructions further comprising: instructions for randomly selecting a second storage area from among said plurality of available storage areas; instructions for determining whether said second selected storage area contains at least said predetermined amount of free space; instructions for allocating said predetermined amount of said free space on said second selected storage area to create a second allocated storage area; instructions for writing a second symbolic link to said second allocated storage area in said directory; and instructions for writing said data to said second allocated storage area.
 16. A computer readable medium in accordance with claim 15, wherein said allocated storage area, said second allocated storage area, and said directory are located on different file systems.
 17. A method, using a computer system, for creating a virtual storage volume having a size independent of a file size limitation of said computer system, the method comprising the steps of: (a) randomly selecting a storage area from among a plurality of available storage areas; (b) determining whether said selected storage area contains at least a predetermined amount of free space; (c) allocating said predetermined amount of said free space on said selected storage area to create an allocated storage area; (d) writing a symbolic link to said allocated storage area in a directory associated with said virtual storage volume; (e) writing data destined for said virtual storage volume in said allocated storage area; and (f) repeating steps (a) through (d) when data to be written to said allocated storage area exceeds an amount of free space remaining on said allocated storage area so as to allow said size of said virtual storage volume to exceed said file size limitation.
 18. A method in accordance with claim 17, wherein said size of said virtual storage volume is independent of a file-system size limitation of said computer system.
 19. A method in accordance with claim 17, wherein said allocated storage area and said directory are located on different file systems.
 20. A method in accordance with claim 17, wherein said plurality of available storage areas are connected to said computer system via a communications network.
 21. A computer readable medium, having computer executable instructions stored therein, for creating a virtual storage volume having a size independent of a file size limitation of said computer, said computer executable instructions comprising: (a) instructions for randomly selecting a storage area from among a plurality of available storage areas; (b) instructions for determining whether said selected storage area contains at least a predetermined amount of free space; (c) instructions for allocating said predetermined amount of said free space on said selected storage area to create an allocated storage area; (d) instructions for writing a symbolic link to said allocated storage area in a directory associated with said virtual storage volume; (e) instructions for writing data destined for said virtual storage volume in said allocated storage area; and (f) instructions for repeatedly performing instructions (a) through (d) when data to be written to said allocated storage area exceeds an amount of free space remaining on said allocated storage area so as to allow said size of said virtual storage volume to exceed said file size limitation.
 22. A computer readable medium in accordance with claim 21, wherein said size of said virtual storage volume is independent of a file-system size limitation of said computer system.
 23. A computer readable medium in accordance with claim 21, wherein said allocated storage area and said directory are located on different file systems.
 24. A computer readable medium in accordance with claim 21, wherein said plurality of available storage areas are connected to said computer system via a communications network.
 25. A method, using a computer system, for performing a file operation with respect to a virtual storage volume, the method comprising the steps of: reading, in a directory associated with said virtual storage volume, a symbolic link to an allocated storage area; and performing said file operation with respect to said allocated storage area; wherein said virtual storage volume has a size independent of a file size limitation of said computer system; and wherein said allocated storage area is randomly selected from among a plurality of available storage areas.
 26. A method in accordance with claim 25, wherein said virtual storage volume is independent of a file-system size limitation of said computer system.
 27. A method in accordance with claim 25, wherein said allocated storage area and said directory are located on different file systems.
 28. A method in accordance with claim 25, wherein said allocated storage area is randomly selected from a list of said plurality of available storage areas.
 29. A method in accordance with claim 25, wherein said plurality of available storage areas are connected to said computer system via a communications network.
 30. A method in accordance with claim 25, wherein said allocated storage area has a size determined by a parameter associated with said virtual storage volume.
 31. A computer readable medium, having computer executable instructions stored therein, for performing a file operation with respect to a virtual storage volume, said computer executable instructions comprising: instructions for reading, in a directory associated with said virtual storage volume, a symbolic link to an allocated storage area; and instructions for performing said file operation with respect to said allocated storage area; wherein said virtual storage volume has a size independent of a file size limitation of said computer system; and wherein said allocated storage area is randomly selected from among a plurality of available storage areas.
 32. A computer readable medium in accordance with claim 31, wherein said virtual storage volume is independent of a file-system size limitation of said computer system.
 33. A computer readable medium in accordance with claim 31, wherein said allocated storage area and said directory are located on different file systems.
 34. A computer readable medium in accordance with claim 31, wherein said allocated storage area is randomly selected from a list of said plurality of available storage areas.
 35. A computer readable medium in accordance with claim 31, wherein said plurality of available storage areas are connected to said computer system via a communications network.
 36. A computer readable medium in accordance with claim 31, wherein said allocated storage area has a size determined by a parameter associated with said virtual storage volume. 